Attentive Experience Replay (AER) uniformly samples a candidate batch that is \(\lambda\) times larger than the mini-batch size \(k\), then keeps the \(k\) transitions whose states are most similar to the agent's current state as the mini-batch.
The coefficient \(\lambda\) is annealed to 1 during training.
The similarity function \(\mathcal{F}(s_j, s_t)\) is task-dependent. In the paper, the authors use cosine similarity for MuJoCo tasks and the norm of the difference between embedded features for Atari 2600 games.
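As a rough sketch (not the authors' code), the two similarity choices could be written as follows; `embed` is a hypothetical feature extractor, e.g. the convolutional encoder of the Q-network, and is not specified here.

```python
import numpy as np

def cosine_similarity(s_j, s_t):
    """Cosine similarity between raw state vectors (the MuJoCo choice)."""
    return float(np.dot(s_j, s_t) /
                 (np.linalg.norm(s_j) * np.linalg.norm(s_t) + 1e-8))

def embedded_similarity(s_j, s_t, embed):
    """Negated norm of the difference between embedded features (the Atari
    choice). Negating the distance turns it into a similarity score, so
    larger values mean more similar states. `embed` is an assumed encoder."""
    return -float(np.linalg.norm(embed(s_j) - embed(s_t)))
```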
AER can be implemented on top of a simple ReplayBuffer class: sample an enlarged batch of \(\lambda k\) transitions, compute the similarity of each candidate to the current state, and keep the top \(k\), as sketched below.
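A minimal sketch of this idea, assuming a NumPy ring buffer and a `similarity` callable such as the helpers above (the class and method names here are illustrative, not from the paper):

```python
import numpy as np

class ReplayBuffer:
    """Plain ring-buffer replay memory with an AER-style sample() method."""

    def __init__(self, capacity, state_dim):
        self.capacity = capacity
        self.states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.next_states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=np.float32)
        self.ptr, self.size = 0, 0

    def add(self, s, a, r, s_next, done):
        self.states[self.ptr] = s
        self.actions[self.ptr] = a
        self.rewards[self.ptr] = r
        self.next_states[self.ptr] = s_next
        self.dones[self.ptr] = float(done)
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, k, current_state, lam, similarity):
        # 1) uniformly sample an enlarged candidate batch of size lam * k
        n = min(int(lam * k), self.size)
        idx = np.random.randint(0, self.size, size=n)

        # 2) score each candidate state against the current state s_t
        scores = np.array([similarity(self.states[i], current_state)
                           for i in idx])

        # 3) keep the top-k most similar transitions as the mini-batch
        top = idx[np.argsort(scores)[-k:]]
        return (self.states[top], self.actions[top], self.rewards[top],
                self.next_states[top], self.dones[top])
```

Note that once \(\lambda\) reaches 1 the candidate batch already has size \(k\), so the top-\(k\) selection keeps everything and sampling reduces to ordinary uniform replay.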